Sorted Kernel Matrices as Cluster Validity Indexes
Abstract
This paper addresses two basic issues in data analysis and kernel-machine design: determining the number of partitions in a clustering task and setting the parameters of kernels. A distance metric is presented to measure the similarity between kernel matrices and FCM proximity matrices. It is shown that this measure is maximized, as a function of the kernel and FCM parameters, when there is coherence with the embedded structural information. We show that the alignment function can be maximized with respect to the FCM and kernel parameters. The results shed some light on the general problem of choosing the number of partitions in a clustering task and of setting kernel parameters according to structural information.
Keywords: Affinity matrix, Clustering, Fuzzy C-Means (FCM), Kernel matrix, Reordering, Sorting.
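The abstract does not give the exact form of the similarity measure, so the following is only a rough sketch of the general idea under stated assumptions: a Gaussian kernel matrix K, an FCM proximity matrix taken as P = UᵀU from the membership matrix U, and the common Frobenius-normalized alignment ⟨K, P⟩ / (‖K‖ ‖P‖). All function names, the toy data, and the parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np

def rbf_kernel(X, sigma):
    """Gaussian kernel matrix (assumed kernel; the paper's choice is not stated here)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def fcm_memberships(X, c, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means; returns the c x n membership matrix U."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)           # each column sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W @ X) / W.sum(axis=1, keepdims=True)
        d = np.linalg.norm(X[None, :, :] - centers[:, None, :], axis=2)
        d = np.maximum(d, 1e-12)                # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=0, keepdims=True)
    return U

def alignment(K, P):
    """Frobenius-normalized alignment between two square matrices (assumed measure)."""
    return np.sum(K * P) / (np.linalg.norm(K) * np.linalg.norm(P))

# Toy data: two well-separated blobs (hypothetical example).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
K = rbf_kernel(X, sigma=1.0)

# Scan the number of partitions and report the alignment for each.
for c in (2, 3, 4):
    U = fcm_memberships(X, c)
    P = U.T @ U                                 # assumed FCM proximity matrix
    print(c, alignment(K, P))
```

Scanning the kernel width `sigma` in the same loop would give a two-dimensional grid over kernel and FCM parameters, which is one way to look for the maximum the abstract refers to.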
Similar resources
A cluster validity index for fuzzy clustering
Cluster validity indexes have been used to evaluate the fitness of partitions produced by clustering algorithms. This paper presents a new validity index for fuzzy clustering called a partition coefficient and exponential separation (PCAES) index. It uses the factors from a normalized partition coefficient and an exponential separation measure for each cluster and then pools these two factors t...
Composite Kernel Optimization in Semi-Supervised Metric
Machine-learning solutions to classification, clustering and matching problems critically depend on the adopted metric, which in the past was selected heuristically. In the last decade, it has been demonstrated that an appropriate metric can be learnt from data, resulting in superior performance as compared with traditional metrics. This has recently stimulated a considerable interest in the to...
Some new indexes of cluster validity
We review two clustering algorithms (hard c-means and single linkage) and three indexes of crisp cluster validity (Hubert's statistics, the Davies-Bouldin index, and Dunn's index). We illustrate two deficiencies of Dunn's index which make it overly sensitive to noisy clusters and propose several generalizations of it that are not as brittle to outliers in the clusters. Our numerical examples sh...
Sum-of-Squares Based Cluster Validity Index and Significance Analysis
Different clustering algorithms achieve different results to certain data sets because most clustering algorithms are sensitive to the input parameters and the structure of data sets. Cluster validity, as the way of evaluating the result of the clustering algorithms, is one of the problems in cluster analysis. In this paper, we build up a framework for cluster validity process, meanwhile a sum-...
Assessing the Quality of Fuzzy Partitions Using Relative Intersection
In this paper, conventional validity indexes are reviewed and the shortcomings of the fuzzy cluster validation index based on intercluster proximity are examined. Based on these considerations, a new cluster validity index is proposed for fuzzy partitions obtained from the fuzzy c-means algorithm. The proposed validity index is defined as the average value of the relative intersections of all p...